ABSTRACT. In this paper, we introduce a regression model using dummy variables within the framework of
neutrosophic statistics. This model is designed for regression analysis under conditions of uncertainty, extending the
classical regression model with dummy variables. We also present regression and analysis of variance under
neutrosophic statistics. The application of our model is demonstrated through simulation and comparative studies,
showing that the results differ from those obtained using classical regression. Our findings indicate that the degree of
uncertainty significantly impacts the predicted and residual values.


1. Introduction 

In general, regression analysis does not account for categorical variables in modeling and 
prediction. To address this, regression with dummy variables is employed. In this method, 
specific categories are assigned to variables, and regression models are created for these 
categories to facilitate prediction. The primary advantage of using dummy variables is that they 
enable the inclusion of categorical data in the analysis. The application of regression with 
dummy variables for analyzing structural change is demonstrated by [1]. [2] provides a detailed 
overview of dummy variables in regression analysis. [3] applied this technique for rainfall 
forecasting. [4] used it for insurance data analysis. [5] explored its use in a probabilistic
environment with applications in quality control. [6] evaluated and applied the regression with
dummy variables. [7] presented its application in analyzing students' learning data. Further
applications can be found in [8], [9], [10] and [11]. 

Neutrosophic statistics, a branch of mathematical science, is essential for managing, analyzing, 
presenting, and interpreting uncertain data. This field extends classical statistics by integrating 
degrees of uncertainty often neglected in traditional methods. Introduced by [12], neutrosophic 
statistics has shown increased flexibility for dealing with imprecise data, as evidenced by 
numerous subsequent studies. Recent research highlights the effectiveness of neutrosophic 
statistical analysis, especially with complex or e-commerce data. Significant contributions 
include studies by [13], [14], [15], and [16]. [17] explored neutrosophic multiple regression
analysis, while [18] examined split-plot design for neutrosophic data analysis. Additionally, [19] 
looked into the analysis of covariance for imprecise data, and [20] studied neutrosophic 
statistical analysis in the context of temperature variations across cities. [21] introduced 
neutrosophic kernel regression for mean estimation, further broadening the applications of 
neutrosophic statistical methods. 

Upon reviewing the literature, we discovered a substantial body of work on dummy regression 
within classical statistics. However, existing dummy regression methods are applicable only 
when all data observations are precise. In practice, statistical data often contains imprecision or 
intervals, rendering classical dummy regression methods unsuitable under these conditions of 
uncertainty. To the best of the author's knowledge, there has been no research on dummy 
regression using neutrosophic statistics. This paper aims to address this gap by proposing a 
dummy regression model under neutrosophic statistics. We will introduce a dummy regression 
framework that accommodates uncertainty and present an analysis of variance within this 
context. Additionally, we will conduct extensive simulation studies and apply the proposed 
regression model to real-world data. Our findings will demonstrate the significant impact of
varying degrees of uncertainty on the predicted and residual values from the model.


2. Neutrosophic Random Variables 

Consider two neutrosophic random variables, $X_N = X_L + X_L I_N$ and $Y_N = Y_L + Y_L I_N$, each made up
of two components. The terms $X_L$ and $Y_L$ represent the determinate parts, akin to those in
classical statistics. The terms $X_L I_N$ and $Y_L I_N$ represent the indeterminate parts, with
$I_N \in [I_L, I_U]$ denoting the degree of indeterminacy or uncertainty. When $I_N = 0$, these neutrosophic
random variables reduce to classical statistical variables. Assume $X_L$ and $Y_L$ follow normal
distributions with means $\mu_X$ and $\mu_Y$, and variances $\sigma_X^2$ and $\sigma_Y^2$, respectively, as stated by [22].
Neutrosophic logic extends fuzzy logic, with $I_N^2 = I_N$ and $I_N^n = I_N$ for $n \in \mathbb{N}$. Given this context,
we outline some properties of these proposed neutrosophic random variables.

\[ E(X_N) = E(X_L + X_L I_N) = (1 + I_N)\mu_X \quad \text{and} \quad E(Y_N) = E(Y_L + Y_L I_N) = (1 + I_N)\mu_Y \]

\[ Var(X_N) = Var(X_L + X_L I_N) = (1 + I_N)^2 \sigma_X^2 \quad \text{and} \quad Var(Y_N) = Var(Y_L + Y_L I_N) = (1 + I_N)^2 \sigma_Y^2 \]

\[ Var(X_N + Y_N) = (1 + I_N)^2 \sigma_X^2 + (1 + I_N)^2 \sigma_Y^2 \]
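As a quick numerical check of the mean and variance properties, the following Python sketch compares the closed-form values with a Monte Carlo estimate. It assumes the scaling form $X_N = (1 + I_N) X_L$ that the formulas imply; the function names, the seed, and the parameter values ($\mu_X = 5$, $\sigma_X = 2$, $I_N = 0.3$) are illustrative choices and do not come from the paper.

```python
import random


def neutrosophic_moments(mu, sigma2, i_n):
    """Closed-form mean and variance of X_N = (1 + I_N) * X_L."""
    scale = 1.0 + i_n
    return scale * mu, scale ** 2 * sigma2


def simulated_moments(mu, sigma, i_n, n=100_000, seed=0):
    """Monte Carlo estimate of the same two moments (illustrative only)."""
    rng = random.Random(seed)
    draws = [(1.0 + i_n) * rng.gauss(mu, sigma) for _ in range(n)]
    mean = sum(draws) / n
    var = sum((d - mean) ** 2 for d in draws) / n
    return mean, var


# Hypothetical parameter values used only for the comparison.
exact_mean, exact_var = neutrosophic_moments(mu=5.0, sigma2=4.0, i_n=0.3)
sim_mean, sim_var = simulated_moments(mu=5.0, sigma=2.0, i_n=0.3)
print(f"closed form: mean={exact_mean:.3f}, var={exact_var:.3f}")
print(f"simulated:   mean={sim_mean:.3f}, var={sim_var:.3f}")
```

Setting `i_n=0` returns the classical mean and variance unchanged, which mirrors the reduction to classical statistics described in the text.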


3. Methodology 

As previously discussed, regression with dummy variables is used for categorical variables. 
Under classical statistics, this type of regression can only be applied when data are precise. It is 
not suitable for use in situations involving uncertainty or imprecise data. In this section, we will 
modify the traditional regression with dummy variables by incorporating neutrosophic 
statistics. Our goal is to develop a regression model with dummy variables that remains 
effective even when observations are imprecise or uncertain. Assume the dependent variable 
$Y_N = Y_L + Y_U I_N$ is a neutrosophic random variable, composed of a determinate part $Y_L$ and an
indeterminate part $Y_U I_N$, with $I_N$ representing the degree of indeterminacy. The neutrosophic
regression with two dummy variables is then defined as follows:

\[ Y_L + Y_U I_N = \beta_0 + \beta_1 X + \beta_2 D_i + \epsilon_i; \quad I_N \in [I_L, I_U], \; i = 1, 2 \tag{1} \]

Note that $D_i$ represents the category, $X$ is the independent variable, and $\beta_0$, $\beta_1$, and $\beta_2$ are the
intercept and coefficients of the model, respectively; $\epsilon_i$ is the random error. Let $D = 1$ indicate
defective items and $D = 0$ indicate non-defective items, with $I_N = 0$. The proposed regression
model for defective items can be written as follows:

\[ Y_L = \beta_0 + \beta_1 X + \beta_2 + \epsilon_i \tag{2} \]

The proposed regression model for non-defective items can be expressed as follows:

\[ Y_L = \beta_0 + \beta_1 X + \epsilon_i \tag{3} \]

Let $I_N = I_U$. The proposed regression model for defective items can be expressed as follows:

\[ Y_L + Y_U I_U = \beta_0 + \beta_1 X + \beta_2 + \epsilon_i \tag{4} \]

The proposed regression model for non-defective items can be formulated as follows:

\[ Y_L + Y_U I_U = \beta_0 + \beta_1 X + \epsilon_i \tag{5} \]

Note that the proposed regression with dummy variables simplifies to the classical regression
with dummy variables when $I_N = 0$.
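The role of the dummy variable in equations (2) and (3) can be sketched in a few lines of Python: both equations are the same linear predictor evaluated at $D = 1$ and $D = 0$, so the two fitted lines are parallel and differ by the dummy coefficient. The function name and the coefficient values below are hypothetical and serve only as an illustration.

```python
def dummy_regression_mean(beta0, beta1, beta2, x, d):
    """Right-hand side of the dummy-variable model, without the error term."""
    if d not in (0, 1):
        raise ValueError("the dummy variable must be 0 or 1")
    return beta0 + beta1 * x + beta2 * d


# Hypothetical coefficients, chosen only to illustrate the two cases.
b0, b1, b2 = 2.0, 0.5, 1.5

defective = dummy_regression_mean(b0, b1, b2, x=4.0, d=1)      # equation (2)
non_defective = dummy_regression_mean(b0, b1, b2, x=4.0, d=0)  # equation (3)

# The gap between the two categories equals the dummy coefficient.
print(defective - non_defective)
```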
3.1 Neutrosophic Analysis of Variance (NANOVA) 
In this section, we will extend the existing analysis of variance (ANOVA) under classical 
statistics by incorporating neutrosophic statistics. The neutrosophic analysis of variance 
(NANOVA) for regression with a dummy variable is given in Table 1. 

Table 1: NANOVA table

Source | df | Sum of squares | Mean square | $F_N$ test
Regression | $k$ | $RSS_N$ | $RSS_N / k$ | $(RSS_N / k) / (ESS_N / (n - k - 1))$
Residual | $n - k - 1$ | $ESS_N$ | $ESS_N / (n - k - 1)$ |
Total | $n - 1$ | $RSS_N + ESS_N$ | |

Note here that the error (residual) sum of squares, say $ESS_N$, is given by

\[ ESS_N = \sum (Y_N - \hat{Y}_N)^2 \tag{6} \]

where $\hat{Y}_N$ denotes the predicted values.

The regression sum of squares, say $RSS_N$, is given by

\[ RSS_N = \sum (\hat{Y}_N - \bar{Y}_N)^2 \tag{7} \]

The neutrosophic F-test is given by

\[ F_N = \frac{RSS_L / k}{ESS_L / (n - k - 1)} + \frac{RSS_U / k}{ESS_U / (n - k - 1)} I_N; \quad I_N \in [I_L, I_U] \tag{8} \]

The proposed F-test simplifies to the classical F-test when $I_N = 0$.
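The F statistic of equation (8) can be sketched as two classical F ratios, one for the lower and one for the upper sums of squares. The Python below is a minimal illustration with hypothetical function names; as example inputs it uses the rounded sums of squares reported in Table 4 ($k = 3$, $n = 11$), so the results agree with the tabulated F values only up to rounding.

```python
def f_statistic(rss, ess, k, n):
    """Classical F ratio: (RSS / k) / (ESS / (n - k - 1))."""
    residual_df = n - k - 1
    if k <= 0 or residual_df <= 0:
        raise ValueError("need k > 0 and n > k + 1")
    return (rss / k) / (ess / residual_df)


def neutrosophic_f(rss_interval, ess_interval, k, n):
    """Return (F_L, F_U) so that F_N = F_L + F_U * I_N."""
    rss_l, rss_u = rss_interval
    ess_l, ess_u = ess_interval
    return f_statistic(rss_l, ess_l, k, n), f_statistic(rss_u, ess_u, k, n)


# Rounded sums of squares from Table 4 (first entry: lower fit, second: upper fit).
f_l, f_u = neutrosophic_f((134, 137), (131, 125), k=3, n=11)
print(f"F_L = {f_l:.2f}, F_U = {f_u:.2f}")
```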


4. Application 

In this section, we demonstrate the application of the proposed regression with dummy
variables using neutrosophic data on expected defective item counts from three different
machines, alongside the hours of operation for each machine. The anticipated defective
counts from these machines are intervals rather than exact figures; the data are given in
Table 2. Because the defective item counts are imprecise, classical regression with dummy
variables cannot be applied directly, and the proposed regression method under neutrosophic
statistics is used instead. Table 3 summarizes the fitted regression: multiple R ranges from
0.7110 to 0.7229, and the standard error from 4.2286 to 4.3262. Table 4 presents the analysis of
variance (ANOVA): the F-test values range from 2.38 to 2.55, and the corresponding p-values
(Significance F) range from 0.1385 to 0.1549, all above the 0.05 significance level, so the
regression is not statistically significant. The neutrosophic predicted and residual values are
given in Table 5, which shows a notable disparity between the regression methods. Figures
1-2 further illustrate the discrepancies in predicted and residual values between the proposed
and existing methods. These results show that regression with dummy variables behaves
differently under indeterminacy and support the use of the proposed method for data subject
to uncertainty. The neutrosophic F-test is given by

\[ F_N = 2.38 + 2.55 I_N; \quad I_N \in [0, 0.06667] \]
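The upper limit 0.06667 in this expression is consistent with $I_U = (F_U - F_L) / F_U$, so that the neutrosophic form returns the lower value at $I_N = 0$ and the upper value at $I_N = I_U$. This reading is an inference from the reported numbers rather than a rule stated in the text; the Python sketch below, with hypothetical function names, shows the arithmetic.

```python
def neutrosophic_form(lower, upper):
    """Write the interval [lower, upper] as lower + upper * I_N, I_N in [0, I_U]."""
    if upper == 0:
        raise ValueError("upper value must be non-zero")
    i_u = (upper - lower) / upper  # assumed definition of the upper indeterminacy
    return lower, upper, i_u


def evaluate(lower, upper, i_n):
    """Value of the neutrosophic form at a given degree of indeterminacy."""
    return lower + upper * i_n


f_l, f_u, i_u = neutrosophic_form(2.38, 2.55)
print(f"I_U = {i_u:.5f}")
print(f"F_N at I_N = 0:   {evaluate(f_l, f_u, 0.0):.2f}")
print(f"F_N at I_N = I_U: {evaluate(f_l, f_u, i_u):.2f}")
```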


Table 2: The data of number of defectives

Number of defectives | Number of hours | Machine
[6,7] | 3 | A
[8,12] | 4 | A
[5,8] | 3 | A
[9,11] | 6 | A
[7,10] | 9 | B
[3,6] | 2 | A
[12,15] | 8 | B
[2,3] | 1 | C
[14,16] | 7 | C
[19,20] | 8 | B
[13,16] | 3 | B

Table 3: SUMMARY OUTPUT 


Regression Statistics 


Multiple R [0.7110,0.7229] 
R Square [0.5055,0.5226] 
Adjusted R Square [0.2935,0.3180] 
Standard Error [4.3262,4.2286] 


Observations 11

Table 4: ANOVA table

Source | df | SS | MS | F | Significance F
Regression | 3 | [134,137] | [44.63,45.67] | [2.38,2.55] | [0.1549,0.1385]
Residual | 7 | [131,125] | [18.72,17.88] | |
Total | 10 | [265,262] | | |

Term | Coefficients | Standard Error | t Stat | P-value
Intercept | [3.0976,5.6829] | [2.9450,2.8786] | [1.05,1.97] | [0.3278,0.0889]
Hours | [0.8618,0.8659] | [0.6168,0.6028] | [1.40,1.44] | [0.2050,0.1941]
Machine B | [3.6199,3.5061] | [3.5804,3.4996] | [1.01,1.00] | [0.3457,0.3498]
Machine C | [1.4553,0.3537] | [3.6279,3.5461] | [0.40,0.10] | [0.7003,0.9234]

Table 5: RESIDUAL OUTPUT

Observation | Predicted Defectives | Residuals
1 | [5.6829,8.2805] | [0.3171,-1.2805]
2 | [6.5447,9.1463] | [1.4553,2.8537]
3 | [5.6829,8.2805] | [-0.6829,-0.2805]
4 | [8.2683,10.8780] | [0.7317,0.1220]
5 | [14.4736,16.9817] | [-7.4736,-6.9817]
6 | [4.8211,7.4146] | [-1.8211,-1.4146]
7 | [13.6118,16.1159] | [-1.6118,-1.1159]
8 | [5.4146,6.9024] | [-3.4146,-3.9024]
9 | [10.5854,12.0976] | [3.4146,3.9024]
10 | [13.6118,16.1159] | [5.3882,3.8841]
11 | [9.3028,11.7866] | [3.6972,4.2134]
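The entries of Table 5 follow from the coefficients in Table 4: each bound of the predicted interval is the intercept plus the hours coefficient times the hours of operation, plus the machine coefficient when the machine is B or C. The sketch below uses hypothetical helper names and the four-decimal coefficients as printed, so it matches the table only up to rounding in the last digit.

```python
# Coefficients as printed in Table 4 (lower fit and upper fit).
COEF_LOWER = {"intercept": 3.0976, "hours": 0.8618, "B": 3.6199, "C": 1.4553}
COEF_UPPER = {"intercept": 5.6829, "hours": 0.8659, "B": 3.5061, "C": 0.3537}


def predict(coef, hours, machine):
    """Fitted value for one bound; Machine A is the baseline category."""
    if machine not in ("A", "B", "C"):
        raise ValueError("machine must be 'A', 'B' or 'C'")
    shift = coef[machine] if machine in ("B", "C") else 0.0
    return coef["intercept"] + coef["hours"] * hours + shift


def predicted_interval(hours, machine):
    """Lower and upper predicted defectives."""
    return predict(COEF_LOWER, hours, machine), predict(COEF_UPPER, hours, machine)


def residual_interval(observed, hours, machine):
    """Observed bound minus predicted bound, for the lower and upper fits."""
    pred_l, pred_u = predicted_interval(hours, machine)
    obs_l, obs_u = observed
    return obs_l - pred_l, obs_u - pred_u


# Observation 5 of Table 2: defectives [7, 10], 9 hours, Machine B.
pred_5 = predicted_interval(9, "B")
resid_5 = residual_interval((7, 10), 9, "B")
print([round(v, 4) for v in pred_5], [round(v, 4) for v in resid_5])
```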


[Figure: predicted values (vertical axis, 0 to 18) against observations 1 to 11, with series Predicted L and Predicted U]

Figure 1: The neutrosophic predicted values for the data


Int. J. Anal. Appl. (2024), 22:114 7 


[Figure: residual values (vertical axis, -10 to 8) against observations, with series Residual L and Residual U]

Figure 2: The neutrosophic residuals values for the data


5. Simulation 

This section presents a simulation study examining how the degree of indeterminacy ($I_N$) affects
key statistics, such as predicted values, residual values, percentiles, and the number of
defectives. Using the data on the number of defectives from Table 2, we explore various values
of $I_N$ to observe their impact on these statistics. Predicted values derived from the model are
reported in Table 6, while Table 7 displays residual values. Percentile values are detailed in
Table 8, and the number of defectives is given in Table 9. Table 6 shows that predicted values
increase as the degree of indeterminacy increases. For instance, for the first observation, the
predicted value is 9.1085 with $I_N = 0.1$, whereas with $I_N = 1$ it rises to 16.5610. This
behavior is further illustrated in Figure 3, where the predicted value curve for $I_N = 0.1$ is notably
lower than the curves for larger $I_N$ values. Similarly, Table 7 shows that the magnitude of the
residuals grows with $I_N$. For instance, for the last observation, the residual value is 4.2134
with $I_N = 0$ and 4.6348 with $I_N = 0.1$, while with $I_N = 1$ it increases to 8.4268. Figure 4
shows this trend, with the residual value curve for $I_N = 0.1$ notably lower than the curves for
larger $I_N$ values. Examining Table 8, we observe essentially no change in percentile values
across degrees of indeterminacy. For instance, when $I_N = 0.1$, the first percentile value is
4.5455, the same as the value when $I_N = 1$. Figure 5 further confirms this stability, with the
percentile value curve for $I_N = 0.1$ coinciding with that of other $I_N$ values. In contrast, Table 9
shows a clear increase in the number of defectives as the degree of indeterminacy rises. For
instance, in the first row, there are 3 defectives with $I_N = 0.1$, whereas with $I_N = 1$ this
figure grows to 6. Figure 6 illustrates this trend graphically, with the defective value curve for
$I_N = 0.1$ notably lower than the curves for larger $I_N$ values. In conclusion, this study highlights
the impact of uncertainty on predicted and residual values within the model. Therefore,
decision-makers should exercise caution when employing regression with dummy variables in
uncertain conditions.


Table 6: Effect on Predicted values when n=11

I_N = 0 | I_N = 0.1 | I_N = 0.2 | I_N = 0.3 | I_N = 0.4 | I_N = 0.5 | I_N = 0.6 | I_N = 0.7 | I_N = 0.8 | I_N = 0.9 | I_N = 1
8.2805 | 9.1085 | 9.9366 | 10.7646 | 11.5927 | 12.4207 | 13.2488 | 14.0768 | 14.9049 | 15.7329 | 16.5610
9.1463 | 10.0610 | 10.9756 | 11.8902 | 12.8049 | 13.7195 | 14.6341 | 15.5488 | 16.4634 | 17.3780 | 18.2927
8.2805 | 9.1085 | 9.9366 | 10.7646 | 11.5927 | 12.4207 | 13.2488 | 14.0768 | 14.9049 | 15.7329 | 16.5610
10.8780 | 11.9659 | 13.0537 | 14.1415 | 15.2293 | 16.3171 | 17.4049 | 18.4927 | 19.5805 | 20.6683 | 21.7561
16.9817 | 18.6799 | 20.3780 | 22.0762 | 23.7744 | 25.4726 | 27.1707 | 28.8689 | 30.5671 | 32.2652 | 33.9634
7.4146 | 8.1561 | 8.8976 | 9.6390 | 10.3805 | 11.1220 | 11.8634 | 12.6049 | 13.3463 | 14.0878 | 14.8293
16.1159 | 17.7274 | 19.3390 | 20.9506 | 22.5622 | 24.1738 | 25.7854 | 27.3970 | 29.0085 | 30.6201 | 32.2317
6.9024 | 7.5927 | 8.2829 | 8.9732 | 9.6634 | 10.3537 | 11.0439 | 11.7341 | 12.4244 | 13.1146 | 13.8049
12.0976 | 13.3073 | 14.5171 | 15.7268 | 16.9366 | 18.1463 | 19.3561 | 20.5659 | 21.7756 | 22.9854 | 24.1951
16.1159 | 17.7274 | 19.3390 | 20.9506 | 22.5622 | 24.1738 | 25.7854 | 27.3970 | 29.0085 | 30.6201 | 32.2317
11.7866 | 12.9652 | 14.1439 | 15.3226 | 16.5012 | 17.6799 | 18.8585 | 20.0372 | 21.2159 | 22.3945 | 23.5732
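The rows of Table 6 appear consistent with multiplying the $I_N = 0$ column by $(1 + I_N)$. The following Python sketch, with a hypothetical function name, applies that scaling to the last row; because it starts from the four-decimal value 11.7866, the last digit can differ from the tabulated entries.

```python
def scale_by_indeterminacy(value, i_n):
    """Scale a classical (I_N = 0) value by the factor (1 + I_N)."""
    if not 0.0 <= i_n <= 1.0:
        raise ValueError("I_N is expected to lie in [0, 1]")
    return (1.0 + i_n) * value


# Grid of indeterminacy values used in the table headers.
I_GRID = [i / 10 for i in range(11)]

# Last row of Table 6, rebuilt from its I_N = 0 entry.
row = [scale_by_indeterminacy(11.7866, i_n) for i_n in I_GRID]
for i_n, value in zip(I_GRID, row):
    print(f"I_N = {i_n:.1f}: {value:.4f}")
```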


Table 7: Effect on Residual values when n=11

I_N = 0 | I_N = 0.1 | I_N = 0.2 | I_N = 0.3 | I_N = 0.4 | I_N = 0.5 | I_N = 0.6 | I_N = 0.7 | I_N = 0.8 | I_N = 0.9 | I_N = 1
-1.2805 | -1.4085 | -1.5366 | -1.6646 | -1.7927 | -1.9207 | -2.0488 | -2.1768 | -2.3049 | -2.4329 | -2.5610
2.8537 | 3.1390 | 3.4244 | 3.7098 | 3.9951 | 4.2805 | 4.5659 | 4.8512 | 5.1366 | 5.4220 | 5.7073
-0.2805 | -0.3085 | -0.3366 | -0.3646 | -0.3927 | -0.4207 | -0.4488 | -0.4768 | -0.5049 | -0.5329 | -0.5610
0.1220 | 0.1341 | 0.1463 | 0.1585 | 0.1707 | 0.1829 | 0.1951 | 0.2073 | 0.2195 | 0.2317 | 0.2439
-6.9817 | -7.6799 | -8.3780 | -9.0762 | -9.7744 | -10.4726 | -11.1707 | -11.8689 | -12.5671 | -13.2652 | -13.9634
-1.4146 | -1.5561 | -1.6976 | -1.8390 | -1.9805 | -2.1220 | -2.2634 | -2.4049 | -2.5463 | -2.6878 | -2.8293
-1.1159 | -1.2274 | -1.3390 | -1.4506 | -1.5622 | -1.6738 | -1.7854 | -1.8970 | -2.0085 | -2.1201 | -2.2317
-3.9024 | -4.2927 | -4.6829 | -5.0732 | -5.4634 | -5.8537 | -6.2439 | -6.6341 | -7.0244 | -7.4146 | -7.8049
3.9024 | 4.2927 | 4.6829 | 5.0732 | 5.4634 | 5.8537 | 6.2439 | 6.6341 | 7.0244 | 7.4146 | 7.8049
3.8841 | 4.2726 | 4.6610 | 5.0494 | 5.4378 | 5.8262 | 6.2146 | 6.6030 | 6.9915 | 7.3799 | 7.7683
4.2134 | 4.6348 | 5.0561 | 5.4774 | 5.8988 | 6.3201 | 6.7415 | 7.1628 | 7.5841 | 8.0055 | 8.4268

Table 8: Effect on Percentiles values when n=11

I_N = 0 | I_N = 0.1 | I_N = 0.2 | I_N = 0.3 | I_N = 0.4 | I_N = 0.5 | I_N = 0.6 | I_N = 0.7 | I_N = 0.8 | I_N = 0.9 | I_N = 1
4.5455 | 4.5455 | 4.5455 | 4.5455 | 4.5455 | 4.5455 | 4.5455 | 4.5455 | 4.5455 | 4.5455 | 4.5455
13.6364 | 13.6364 | 13.6364 | 13.6364 | 13.6364 | 13.6364 | 13.6364 | 13.6364 | 13.6364 | 13.6364 | 13.6364
22.7273 | 22.7273 | 22.7273 | 22.7273 | 22.7273 | 22.7273 | 22.7273 | 22.7273 | 22.7273 | 22.7273 | 22.7273
31.8182 | 31.8182 | 31.8182 | 31.8182 | 31.8182 | 31.8182 | 31.8182 | 31.8182 | 31.8182 | 31.8182 | 31.8182
40.9091 | 40.9091 | 40.9091 | 40.9091 | 40.9091 | 40.9091 | 40.9091 | 40.9091 | 40.9091 | 40.9091 | 40.9091
50 | 50 | 50 | 50 | 50 | 50 | 50 | 50 | 50 | 50 | 50
59.0909 | 59.0909 | 59.0909 | 59.0909 | 59.0909 | 59.0909 | 59.0909 | 59.0909 | 59.0909 | 59.0909 | 59.0909
68.1818 | 68.1818 | 68.1818 | 68.1818 | 68.1818 | 68.1818 | 68.1818 | 68.1818 | 68.1818 | 68.1818 | 68.1818
77.2727 | 77.2727 | 77.2727 | 77.2727 | 77.2727 | 77.2727 | 77.2727 | 77.2727 | 77.2727 | 77.2727 | 77.2727
86.3636 | 86.3636 | 86.3636 | 86.3636 | 86.3636 | 86.3636 | 86.3636 | 86.3636 | 86.3636 | 86.3636 | 86.3636
95.4545 | 95.4545 | 95.4545 | 95.4545 | 95.4545 | 95.4545 | 95.4545 | 95.4545 | 95.4545 | 95.4545 | 95.4545
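The values in Table 8 match the plotting positions $100(i - 0.5)/n$ for $i = 1, \dots, n$ with $n = 11$. Since these depend only on the sample size, this would explain why the percentiles are identical for every $I_N$. That reading is an inference from the numbers, not something the text states; the function name below is hypothetical.

```python
def percentile_ranks(n):
    """Plotting positions 100 * (i - 0.5) / n for i = 1..n."""
    if n <= 0:
        raise ValueError("n must be positive")
    return [100.0 * (i - 0.5) / n for i in range(1, n + 1)]


ranks = percentile_ranks(11)
print([round(r, 4) for r in ranks])
```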
Table 9: Effect on Number of defectives when n=11

I_N = 0 | I_N = 0.1 | I_N = 0.2 | I_N = 0.3 | I_N = 0.4 | I_N = 0.5 | I_N = 0.6 | I_N = 0.7 | I_N = 0.8 | I_N = 0.9 | I_N = 1
3 | 3 | 4 | 4 | 4 | 5 | 5 | 5 | 5 | 6 | 6
6 | 7 | 7 | 8 | 8 | 9 | 10 | 10 | 11 | 11 | 12
7 | 8 | 8 | 9 | 10 | 11 | 11 | 12 | 13 | 13 | 14
8 | 9 | 10 | 10 | 11 | 12 | 13 | 14 | 14 | 15 | 16
10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20
11 | 12 | 13 | 14 | 15 | 17 | 18 | 19 | 20 | 21 | 22
12 | 13 | 14 | 16 | 17 | 18 | 19 | 20 | 22 | 23 | 24
15 | 17 | 18 | 20 | 21 | 23 | 24 | 26 | 27 | 29 | 30
16 | 18 | 19 | 21 | 22 | 24 | 26 | 27 | 29 | 30 | 32
16 | 18 | 19 | 21 | 22 | 24 | 26 | 27 | 29 | 30 | 32
20 | 22 | 24 | 26 | 28 | 30 | 32 | 34 | 36 | 38 | 40
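The counts in Table 9 appear consistent with scaling the $I_N = 0$ count by $(1 + I_N)$ and rounding to the nearest integer, with halves rounded up. This rule is inferred from the tabulated values and is not stated in the text. The sketch below uses exact rational arithmetic so that ties such as 16.5 are handled predictably; the function name is hypothetical, and $I_N$ is passed in tenths to avoid floating-point ties.

```python
import math
from fractions import Fraction


def scaled_count(base, tenths):
    """Round (1 + tenths/10) * base to the nearest integer, halves rounding up."""
    if not 0 <= tenths <= 10:
        raise ValueError("tenths must lie between 0 and 10")
    exact = Fraction(base * (10 + tenths), 10)
    return math.floor(exact + Fraction(1, 2))


# Rebuild two rows of Table 9 from their I_N = 0 entries (3 and 15).
row_first = [scaled_count(3, t) for t in range(11)]
row_eighth = [scaled_count(15, t) for t in range(11)]
print(row_first)
print(row_eighth)
```

Python's built-in `round` uses round-half-to-even, which would give 4 rather than 5 for 4.5, so the explicit `floor(x + 1/2)` is used instead.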
Figure 4: The neutrosophic residual values for the simulated data

Figure 5: The neutrosophic Percentiles values for the simulated data

Figure 6: The neutrosophic number of defectives for the simulated data


6. Comparative Study 

In this section, we present the results obtained using the proposed regression model with
dummy variables and compare them with the regression model with dummy variables under
classical statistics. As previously mentioned, the proposed regression model reduces to the
classical regression model with dummy variables when there is no uncertainty, i.e., $I_N = 0$. The
results for the existing regression with dummy variables are therefore the $I_N = 0$ columns of
Tables 6-9. These tables show an increasing trend in the predicted values, residual values, and
the number of defectives as $I_N$ increases from 0. For instance, taking the last row of each
table, when $I_N = 0$ the predicted value from Table 6 is 11.7866, and when $I_N = 0.20$ it is
14.1439. Similarly, the residual value from Table 7 is 4.2134 when $I_N = 0$, and 5.0561 when
$I_N = 0.20$. Additionally, the number of defectives from Table 9 is 20 when $I_N = 0$, and 24 when
$I_N = 0.20$. These trends in predicted values, residuals, and the number of defectives are
illustrated in Figures 7-9, which show that the curves for $I_N = 0$ are consistently lower than
those for $I_N = 0.20$.


[Figure: predicted values]

[Figure: residual values against observations]

Figure 8: The residual values from the proposed and predicted values

Figure 9: The number of defectives from the proposed and predicted values


7. Concluding Remarks 

In this paper, we introduced a regression model using dummy variables within the framework 
of neutrosophic statistics. This proposed model is designed for regression analysis under 
conditions of uncertainty, extending the classical regression model with dummy variables. We 
demonstrated the application of our model through simulation and comparative studies, 
showing that the results differ from those obtained using classical regression. Our findings 
indicate that the degree of uncertainty significantly impacts the predicted and residual values. 
We recommend that decision-makers in fields such as metrology, business, industry, medicine, 
and education apply this regression model cautiously when dealing with uncertainty. The 
proposed regression model with dummy variables is suitable for uncertain environments, and 
future research could explore other regression models using this method.